ABSTRACT


Accurately estimating evaporation is necessary for calculating and
scheduling irrigation water requirements. Current literature points to
the use of individual machine learning models for better estimation of
evaporation. However, such methods have not been used in the
Indian context. Moreover, given the diversity of climate, it is
necessary to develop an ensemble technique incorporating a
significant number of machine learning algorithms to obtain a better
estimation of weekly evaporation. The purpose of this paper is to
develop an ensemble technique that combines machine learning
models for better estimation of weekly evaporation. The
results showed that the Bagging Random Forest model performed
better in estimating weekly evaporation than the
other fitted ensemble models.
INTRODUCTION

Estimating evaporation is essential for managing
water resources, optimizing irrigation schedules, and
modeling agricultural production (Konapala et al.,
2020; Short Gianotti et al., 2020). Besides,
evaporation is important in studying
climate change because it returns a large
proportion of global precipitation to the atmosphere
(Ma, 2018; Wang et al., 2019). A few studies have been
conducted to solve different water resource problems
using artificial intelligence approaches, namely
random forest, support vector machine, extreme
learning machine, feed-forward neural network,
Gaussian process regression, and gradient boosting
models (Hameed et al., 2021; Ghorbani et al., 2020;
Ashrafzadeh et al., 2018). Ensemble methods aim to
improve the accuracy of results by combining
multiple models instead of relying on a single model,
and they can improve both classification
and regression performance.
Bagging is an ensemble technique that helps to
improve the performance and accuracy of machine
learning models. It is used to reduce the variance of
an estimation model. Bagging helps to reduce
overfitting and is widely used for regression models.
Ensemble techniques have been used to develop a
system for crop yield estimation and fertilizer
recommendation (Kumaravel et al., 2020). Machine
learning-based classification and regression
techniques are widely used in agriculture for
estimating outcomes from datasets that often
comprise hundreds of features and observations
(Kamani et al., 2019; Kamani et al., 2021; Parmar et
al., 2022). Wheat crop yield has been estimated using
machine learning techniques with different
activation functions, and it has been
recommended to choose the activation function
considering the research purpose and the type of
dataset (Shital et al., 2021). The purpose of this
paper is to determine the effectiveness of a machine
learning-based ensemble technique for estimating
weekly evaporation using weekly weather data of the
Anand district of Gujarat.


OBJECTIVE 

To develop an ensemble technique that combines
machine learning models for better estimation of
weekly evaporation.


MATERIALS AND METHODS 

The present study was undertaken to explore the
possibility of estimating evaporation from
weekly weather variables. To judge the
joint influence of weather variables, a week-wise
approach was considered. Weekly weather data
for Anand, viz. temperature, sunshine hours, wind
velocity, relative humidity, and evaporation, for 43
years (1980 to 2022) were collected from the
Department of Agricultural Meteorology, Anand
Agricultural University, Anand.


Ensemble learning is one of the most popular and
successful techniques in machine learning. It is a
learning method that consists of
combining multiple machine learning models. A
problem in machine learning is that individual models
often perform poorly on their own. Such individual
models are known as weak learners. Weak learners have
either high bias or high variance. Ensemble learning
improves a model’s performance in mainly three
ways:
1. By reducing the variance of weak learners,
2. By reducing the bias of weak learners,
3. By improving the overall accuracy of strong
learners.


Bagging (bootstrap aggregating) is used to reduce the
variance of weak learners. Each base model is trained on a
bootstrap sample of the training data, drawn with replacement,
and for regression the predictions of the base models are averaged.
Fig. 1 depicts a conceptual view of the bagging process.

Fig. 1. Conceptual view of Bagging (Bootstrap aggregating) Process
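
The bootstrap-and-average idea behind bagging can be sketched in a few lines of Python. This is a minimal illustration only and is not the Weka implementation used in the study: the base learner (a one-variable least-squares line), the toy temperature and evaporation values, and the names bootstrap_sample, fit_line, bagging_fit, and bagging_predict are all assumptions introduced here.

```python
import random


def bootstrap_sample(xs, ys, rng):
    """Draw len(xs) (x, y) pairs with replacement."""
    n = len(xs)
    idx = [rng.randrange(n) for _ in range(n)]
    return [xs[i] for i in idx], [ys[i] for i in idx]


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:
        # Degenerate resample (all x identical): fall back to the mean of y.
        return my, 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def bagging_fit(xs, ys, n_models=25, seed=0):
    """Fit one base model per bootstrap sample."""
    rng = random.Random(seed)
    return [fit_line(*bootstrap_sample(xs, ys, rng)) for _ in range(n_models)]


def bagging_predict(models, x):
    """Average the predictions of all base models (regression)."""
    return sum(a + b * x for a, b in models) / len(models)


# Hypothetical toy data: weekly mean temperature and weekly evaporation.
temps = [20.0, 24.0, 28.0, 32.0, 36.0, 40.0]
evap = [2.1, 3.0, 3.8, 5.1, 5.9, 7.2]

models = bagging_fit(temps, evap)
estimate = bagging_predict(models, 30.0)
print(f"Bagged estimate at 30.0: {estimate:.2f}")
```

Averaging over many resampled fits is what reduces the variance of the final estimate. In practice the base learner would be a higher-variance model, such as a decision tree, trained on all of the weather variables.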


RESULTS AND DISCUSSION 


Weka version 3.8.5, an open-source machine learning toolkit for data analysis and visualization, was used to
evaluate the performance and effectiveness of the machine learning-enabled ensemble evaporation estimation
models. Six ensemble models were built using the bagging technique.

Fig. 2. Selected Weekly Weather Variables Distribution


The Bagging Linear Regression, Bagging Neural Network, Bagging REP Tree, Bagging Random Forest,
Bagging KNN, and Bagging Support Vector Machines models were used to estimate evaporation.
Each fitted ensemble model was evaluated in terms of R², MAE, RMSE, RAE, and RRSE. Fig. 2
demonstrates that the selected weekly weather variables have different distribution ranges.
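
For reference, the five evaluation measures can be computed as in the sketch below. It uses the common textbook definitions, in which RAE and RRSE compare the model's errors with those of a baseline that always predicts the mean of the actual values. Whether Weka uses exactly the same baseline, and whether the reported R² is the squared correlation coefficient, is an assumption here; the function name regression_metrics and the sample values are hypothetical.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, RMSE, RAE (%), RRSE (%), R, and R^2 for paired values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n

    abs_err = [abs(p - a) for a, p in zip(actual, predicted)]
    sq_err = [(p - a) ** 2 for a, p in zip(actual, predicted)]

    # Total absolute and squared deviation of the actual values from their mean.
    abs_dev = sum(abs(a - mean_a) for a in actual)
    sq_dev = sum((a - mean_a) ** 2 for a in actual)

    mae = sum(abs_err) / n
    rmse = math.sqrt(sum(sq_err) / n)
    rae = 100.0 * sum(abs_err) / abs_dev
    rrse = 100.0 * math.sqrt(sum(sq_err) / sq_dev)

    # Pearson correlation between actual and predicted values.
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(sq_dev * var_p)

    return {"MAE": mae, "RMSE": rmse, "RAE": rae, "RRSE": rrse,
            "R": r, "R2": r * r}


# Hypothetical actual and estimated weekly evaporation values.
actual = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
metrics = regression_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Lower MAE, RMSE, RAE, and RRSE indicate smaller errors, while a higher R² indicates closer agreement between estimated and actual values.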


Table 1 shows the characteristics of the fitted machine learning-based ensemble models. Of the six
ensemble models in this research work, Bagging Random Forest achieved better performance than the other
fitted models on every reported measure. In general, Bagging Random Forest is the best-fitted ensemble model to
estimate evaporation.


Table 1. Characteristics of Fitted Machine Learning based Ensemble Models

Ensemble Model                    MAE      RMSE     RAE       RRSE      R²
Bagging Linear Regression         0.6419   0.8202   33.41 %   35.65 %   87.27 %
Bagging Neural Network            0.5475   0.7176   28.50 %   31.19 %   90.27 %
Bagging REP Tree                  0.5604   0.7362   29.17 %   32.00 %   89.76 %
Bagging Random Forest             0.5345   0.7010   27.82 %   30.47 %   90.71 %
Bagging KNN                       0.6406   0.8565   33.34 %   37.23 %   86.30 %
Bagging Support Vector Machines   0.6413   0.8213   33.38 %   35.70 %   87.25 %

MAE: Mean Absolute Error; RMSE: Root Mean Squared Error; RAE: Relative Absolute Error;
RRSE: Root Relative Squared Error; R²: Coefficient of Determination.
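
The ranking implied by Table 1 can be checked with a short script. The values below are transcribed from Table 1; the variable names and the choice of RMSE as the sort key are illustrative assumptions.

```python
# Values transcribed from Table 1: (MAE, RMSE, RAE %, RRSE %, R2 %).
table1 = {
    "Bagging Linear Regression": (0.6419, 0.8202, 33.41, 35.65, 87.27),
    "Bagging Neural Network": (0.5475, 0.7176, 28.50, 31.19, 90.27),
    "Bagging REP Tree": (0.5604, 0.7362, 29.17, 32.00, 89.76),
    "Bagging Random Forest": (0.5345, 0.7010, 27.82, 30.47, 90.71),
    "Bagging KNN": (0.6406, 0.8565, 33.34, 37.23, 86.30),
    "Bagging Support Vector Machines": (0.6413, 0.8213, 33.38, 35.70, 87.25),
}

# Rank models from lowest to highest RMSE (column index 1).
by_rmse = sorted(table1, key=lambda name: table1[name][1])

# Model with the highest coefficient of determination (column index 4).
best_by_r2 = max(table1, key=lambda name: table1[name][4])

# Models with the lowest and the highest MAE (column index 0).
lowest_mae = min(table1, key=lambda name: table1[name][0])
highest_mae = max(table1, key=lambda name: table1[name][0])

for rank, name in enumerate(by_rmse, start=1):
    mae, rmse, rae, rrse, r2 = table1[name]
    print(f"{rank}. {name}: RMSE={rmse:.4f}, MAE={mae:.4f}, R2={r2:.2f} %")
```

Sorting by RMSE places Bagging Random Forest first and Bagging KNN last, while the three weakest models differ in MAE only in the third decimal place.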


Fig. 3 demonstrates the estimation accuracy of the different fitted ensemble models. Bagging Random Forest has
better estimation accuracy than the other fitted ensemble models with 90.71 %, followed by Bagging Neural Network with
90.27 %. Bagging KNN has the lowest estimation accuracy with 86.30 %.


@ IJTSRD | Unique Paper ID —- IJTSRD59847 | Volume—7 | Issue—4 | Jul-Aug 2023 Page 987 


International Journal of Trend in Scientific Research and Development @ www..ijtsrd.com eISSN: 2456-6470 


[Bar chart. Vertical axis: Estimation Accuracy (%). Horizontal axis: Ensemble Models (Bagging Linear Regression, Bagging Neural Network, Bagging REPTree, Bagging Random Forest, Bagging KNN, Bagging Support Vector Machines).]


Fig. 3. Estimation Accuracy of Different Fitted Ensemble Models 


Fig. 4 shows the error results of the different fitted ensemble models. Bagging Random Forest has the lowest
Mean Absolute Error (MAE) of 0.53 and Root Mean Squared Error (RMSE) of 0.70, indicating that it made the smallest
errors in estimating evaporation. Bagging KNN has the highest RMSE (0.86), while its MAE (0.64) is essentially
the same as that of Bagging Linear Regression and Bagging Support Vector Machines (Table 1).

Fig. 4. Error Results of Different Fitted Ensemble Models


Fig. 5 depicts the Mean Absolute Error (MAE) of the Bagging Random Forest model. MAE is a measure of estimation
accuracy, and a low MAE suggests that the fitted ensemble model estimates evaporation well. The Multiple
Correlation Coefficient (R) is a measure of how well evaporation can be estimated using a
linear function of a set of weekly weather variables. Usually, a higher R value indicates a better estimation of
evaporation from the selected weekly weather variables. The MAE (0.53) was low and R (0.95) was high,
which indicates that the fitted ensemble model performed very well.
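
As a quick consistency check, the reported R of 0.95 can be compared with the R² of 90.71 % given in Table 1 for Bagging Random Forest. The sketch below assumes that R² is simply the square of the multiple correlation coefficient; the text does not state how R² was obtained.

```python
import math

r_squared = 0.9071           # R² of Bagging Random Forest from Table 1 (90.71 %)
r = math.sqrt(r_squared)     # implied correlation coefficient, under the assumption above
r_rounded = round(r, 2)

print(f"R implied by R² = {r_squared}: {r:.4f} (rounded: {r_rounded})")
```

Under this assumption the two reported values agree to two decimal places.
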
Fig. 5. Mean Absolute Error of Bagging Random Forest

The estimated evaporation is shown in Fig. 6. The actual and the estimated
evaporation are very close to each other. The deviations of the estimated evaporation from the actual evaporation
ranged from -1.7 to 2.2.


[Line chart. Vertical axis: Evaporation. Series: Actual, Estimated.]


Fig. 6. Comparison between Actual and Estimated Evaporation using Bagging Random Forest


CONCLUSION 

Bagging is an ensemble machine learning technique
that helps to avoid overfitting. It is a model
averaging procedure that is often used with decision
trees but can also be applied to other algorithms.
Bagging Random Forest was the best
fitted ensemble model for estimating weekly
evaporation, achieving the highest coefficient of
determination (R²) of 90.71 % among the
fitted models. It also had
the lowest MAE of 0.53 and RMSE of 0.70. There is
also significant scope for applying the ensemble ML
technique to estimate the weekly evaporation of
other data samples and for extending the
analysis. Hence, it can be concluded that the study
helps researchers select an efficient ensemble model
for estimating weekly evaporation.
The bagging ensemble technique is recommended to the
scientific community for estimating weekly
evaporation.